LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching

نویسندگان

چکیده

Chinese short text matching is a fundamental task in natural language processing. Existing approaches usually take characters or words as input tokens. They have two limitations: 1) Some are polysemous, and semantic information not fully utilized. 2) models suffer potential issues caused by word segmentation. Here we introduce HowNet an external knowledge base propose Linguistic Enhanced graph Transformer (LET) to deal with ambiguity. Additionally, adopt the lattice maintain multi-granularity information. Our model also complementary pre-trained models. Experimental results on datasets show that our outperform various typical approaches. Ablation study indicates both important for modeling.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Text-Enhanced Representation Learning for Knowledge Graph

Learning the representations of a knowledge graph has attracted significant research interest in the field of intelligent Web. By regarding each relation as one translation from head entity to tail entity, translation-based methods including TransE, TransH and TransR are simple, effective and achieving the state-of-the-art performance. However, they still suffer the following issues: (i) low pe...

متن کامل

Knowledge Enhanced Hybrid Neural Network for Text Matching

Long text brings a big challenge to semantic matching due to their complicated semantic and syntactic structures. To tackle the challenge, we consider using prior knowledge to help identify useful information and filter out noise to matching in long text. To this end, we propose a knowledge enhanced hybrid neural network (KEHNN). The model fuses prior knowledge into word representations by know...

متن کامل

Linguistic knowledge for specialized text production

This paper outlines a proposal for encoding and describing verb phrase constructions in the knowledge base on the environment EcoLexicon, with the objective of helping translators in specialized text production. In order to be able to propose our own template, the characteristics and limitations of the most representative terminographic resources that include phraseological information were ana...

متن کامل

Chinese Short Text Classification Based on Domain Knowledge

People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of...

متن کامل

Text Classification Using Graph-Encoded Linguistic Elements

Inspired by the goal to more accurately classify text, we describe an effort to map tokens and their characteristic linguistic elements into a graph and use that expressive representation to classify text phrases. We outperform the bag-of-words approach by exploiting word order and the semantic and syntactic characteristics within the phases. In this study, we map tagged corpora into a placehol...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Proceedings of the ... AAAI Conference on Artificial Intelligence

سال: 2021

ISSN: ['2159-5399', '2374-3468']

DOI: https://doi.org/10.1609/aaai.v35i15.17592